Learning from High Dimensional fMRI Data using Random Projections

Author

  • Madhu Advani
Abstract

The term “the curse of dimensionality” refers to the difficulty of organizing and applying machine learning to data in a very high-dimensional space. As the dimensionality grows, the volume of the space increases so rapidly that a fixed number of training examples becomes sparse and difficult to classify. Consequently, the predictive power of a machine learning algorithm trained on a fixed number of examples decreases as the dimensionality increases, a phenomenon known as the Hughes effect. One way of dealing with the curse of dimensionality is to project the data into a lower-dimensional subspace. The statistically optimal way to do this (assuming the data lies on or near a linear subspace) is PCA, which projects the data onto the subspace that preserves as much of its variance as possible. However, PCA is computationally expensive for high-dimensional data compared with dimensionality reduction via random projections, and the distortion introduced when data is compressed via a random projection can be bounded by the Johnson–Lindenstrauss (JL) lemma. Empirical testing is still needed to show how well the technique performs on practical machine learning problems. The major benefit of random projections is that they are computationally cheaper than PCA, particularly when the dimensionality of the data is too large for matrix diagonalization to be feasible, as is the case for very high-dimensional data such as fMRI. The main focus of this paper is applying machine learning algorithms to classify fMRI data, and attempting to predict, both empirically and theoretically, the feasibility of applying random projections to the supervised classification of fMRI data. To that end, we first review some of the applicable theory of random projections; we then empirically examine how logistic regression classification error varies when the data is compressed via random projections. Finally, we describe a theoretical method for estimating an asymptotic bound on the generalization error, which we estimate using k-fold cross-validation.
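The compression scheme the abstract describes can be sketched as follows: multiply the data matrix by a Gaussian random matrix and check that pairwise distances survive, as the JL lemma predicts. This is a minimal NumPy illustration on synthetic data; the dimensions, the random seed, and the distortion check are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for high-dimensional data (e.g. fMRI voxels):
# n examples in d dimensions, with d >> n; k is the target dimension.
n, d, k = 50, 10_000, 1_000

X = rng.normal(size=(n, d))

# Gaussian random projection: entries drawn i.i.d. N(0, 1/k), so that
# squared pairwise distances are preserved in expectation (JL lemma).
R = rng.normal(scale=1.0 / np.sqrt(k), size=(d, k))
Xp = X @ R  # projected data, shape (n, k)

def pairwise_sq_dists(A):
    """Squared Euclidean distances between all rows of A."""
    G = A @ A.T
    sq = np.diag(G)
    return sq[:, None] + sq[None, :] - 2.0 * G

# Empirically measure the worst-case distortion over all pairs:
# the JL lemma guarantees this stays small with high probability
# when k grows like log(n) / eps^2.
D_orig = pairwise_sq_dists(X)
D_proj = pairwise_sq_dists(Xp)
mask = ~np.eye(n, dtype=bool)
ratios = D_proj[mask] / D_orig[mask]
eps = max(1.0 - ratios.min(), ratios.max() - 1.0)
print(f"max squared-distance distortion: {eps:.3f}")
```

Because the projection matrix is data-independent, this costs a single matrix multiply, with no covariance estimation or diagonalization; that is the computational advantage over PCA that the abstract emphasizes. A downstream classifier (e.g. logistic regression) would then be trained on `Xp` instead of `X`.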


Similar Articles

Early detection of MS in fMRI images using deep learning techniques

Introduction & Objective: MS is a disease of the central nervous system in which the immune system attacks the body's own tissues. The disease can affect the brain and spinal cord, causing a wide range of potential symptoms, including balance, movement and vision problems. MRI and fMRI images are a very important tool in the diagnosis and treatment of MS. The aim of this study was to provide...


Compressive Reinforcement Learning with Oblique Random Projections

Compressive sensing has been rapidly growing as a non-adaptive dimensionality reduction framework, wherein high-dimensional data is projected onto a randomly generated subspace. In this paper we explore a paradigm called compressive reinforcement learning, where approximately optimal policies are computed in a low-dimensional subspace generated from a high-dimensional feature space through rando...


Beyond Parity Constraints: Fourier Analysis of Hash Functions for Inference

Random projections have played an important role in scaling up machine learning and data mining algorithms. Recently they have also been applied to probabilistic inference to estimate properties of high-dimensional distributions; however, they all rely on the same class of projections based on universal hashing. We provide a general framework to analyze random projections which relates their st...


Intelligent Control of a Sensor-Actuator System via Kernelized Least-Squares Policy Iteration

In this paper a new framework, called Compressive Kernelized Reinforcement Learning (CKRL), for computing near-optimal policies in sequential decision making with uncertainty is proposed via incorporating the non-adaptive data-independent Random Projections and nonparametric Kernelized Least-squares Policy Iteration (KLSPI). Random Projections are a fast, non-adaptive dimensionality reduction f...


Manifold Learning and Random Projections for Multi-View Object Recognition

Recognizing objects from different viewpoints is a challenging task. One approach for handling this task is to model the appearance of an object under different viewing conditions using a low dimensional subspace. Manifold learning describes the process by which this low dimensional embedding can be generated. However, manifold learning is an unsupervised method and thus gives poor results on c...



Publication date: 2011